DNA Research — Latest Matching Preprints

1

Chromosome-scale genome assembly of Eustoma grandiflorum, the first complete genome sequence in family Gentianaceae

Shirasawa, K.; Arimoto, R.; Hirakawa, H.; Ishimorai, M.; Ghelfi, A.; Miyasaka, M.; Endo, M.; Kawabata, S.; Isobe, S.

2021-09-11 genomics 10.1101/2021.09.09.459690 medRxiv

Top 0.1%

75.4%

Eustoma grandiflorum (Raf.) Shinn. is an annual herbaceous plant native to the southern United States, Mexico, and the Greater Antilles. It has large flowers in a variety of colors and is an important flower crop. In this study, we established a chromosome-scale de novo assembly of E. grandiflorum by integrating four genomic and genetic approaches: (1) Pacific Biosciences (PacBio) Sequel deep sequencing, (2) error correction of the assembly by Illumina short reads, (3) scaffolding by chromatin conformation capture sequencing (Hi-C), and (4) genetic linkage maps derived from an F2 mapping population. The 36 pseudomolecules and 64 unplaced scaffolds were created with a total length of 1,324.8 Mb. Full-length transcript sequences were obtained by PacBio Iso-Seq sequencing for gene prediction on the assembled genome, Egra_v1. A total of 36,619 genes were predicted on the genome as high-confidence (HC) genes. Of the 36,619, 25,936 were assigned functional annotations by ZenAnnotation. Genetic diversity analysis was also performed for nine commercial E. grandiflorum varieties bred in Japan, and 254,205 variants were identified. This is the first report of the construction of reference genome sequences in E. grandiflorum as well as in the family Gentianaceae.

2

Genome assembly of Firmiana major, an endangered savanna tree species endemic to China

Yang, J.; Zhang, R.; Ma, Y.; Ma, Y.; Sun, W.

2024-09-13 genomics 10.1101/2024.09.09.610897 medRxiv

Top 0.1%

70.4%

The tree species Firmiana major was once dominant in the savanna vegetation of the arid hot valleys of southwest China, but was considered extinct in the wild in 1998. After eight small populations were relocated by thorough investigations between 2018 and 2020, the species was subsequently recognized as a Plant Species of Extremely Small Populations (PSESP) in China in need of urgent rescue. Moreover, due to severe human disturbance, other species in the tropical woody genus Firmiana are also endangered, and the species in this genus have almost all been listed as second-class National Protected Wild Plants in China. In order to guide future research into the conservation of this group, we present here the high-quality genome assembly of F. major. This is the first genome assembly in the genus Firmiana, and is 1.4 Gb in size. The assembly consists of 1.18 Gb repetitive sequences, 37,673 annotated genes and 31,965 coding genes.

3

Chromosomal-level genome assembly of Populus adenopoda

Liu, S.; Wang, Z.; Shi, T.; Dan, X.; Zhang, Y.; Liu, J.; Wang, J.

2023-07-12 genomics 10.1101/2023.07.11.548479 medRxiv

Top 0.1%

66.2%

High-quality reference genomes for several species have promoted breeding and functional studies of poplar trees. By resequencing numerous accessions of these and closely related species, single nucleotide polymorphisms (SNPs) and small insertion/deletions (InDels) have been identified to assist in clarifying local adaptation and phenotypic diversification. A chromosome-level genome assembly for P. adenopoda was assembled based on Illumina and PacBio sequencing platforms, facilitated by Hi-C technology. The assembled genome size was about 383 Mb, with 99.70% of the contigs anchored to 19 pseudo-chromosomes, and a total of 33,505 protein-coding genes were annotated. This high-quality genome provided the genomic basis for the subsequent detection of various variants.

4

Haploid-resolved and chromosome-scale genome assembly in Citrus unshiu and its parental species, C. nobilis and C. kinokuni

Isobe, S.; Fujii, H.; Shirasawa, K.; Kawahara, Y.; Endo, T.; Shimada, T.

2023-06-05 genomics 10.1101/2023.06.02.543356 medRxiv

Top 0.1%

64.7%

Citrus, a member of the Rutaceae family, is a widely cultivated crop with numerous cultivars. In Japan, citrus fruits account for a significant portion of agricultural production. Although several new citrus varieties have been developed through conventional breeding programs, satsuma mandarin remains the dominant cultivar. In this study, chromosome-scale and haploid-resolved reference genome sequences of satsuma mandarin (Citrus unshiu Marc) and its parental varieties, kishu mandarin (C. kinokuni hort. ex Tanaka) and kunenbo mandarin (C. nobilis Lour. var. kunip Tanaka), were generated using long-read sequencing and Hi-C technologies. The comparison of haploid and unphased genomes revealed structural differences between them, indicating distinct regions in each haploid. In addition, genetic linkage maps were constructed, and genetic and physical distances were compared. The results showed variations in polymorphism density across different regions of the chromosomes. Together, the obtained results provide valuable insights into the genomic characteristics and structural variations of satsuma mandarin and related citrus varieties. These insights will lead to the further elucidation and improvement of citrus cultivars through genome breeding strategies.

5

Chromosomal-level genome assembly of Populus davidiana

Chen, L.; Jiang, Y.; Shi, T.; Jia, C.; Long, Z.; Zhang, X.; Sang, Y.; Liu, J.; Wang, J.

2023-07-12 genomics 10.1101/2023.07.11.548481 medRxiv

Top 0.1%

55.7%

Although the population structure and genetic differentiation of P. davidiana have been reported, little is known about the characteristics of the P. davidiana genome at the chromosome level. As one of the most widely distributed and ecologically important tree species in China, P. davidiana is an excellent resource for understanding population adaptation to climate and environmental change. Here we present a high-quality assembly based on long-read sequences and Hi-C data. The assembly comprises 19 contiguous chromosomes and provides a powerful tool for future association studies.

6

Decoding the Centromeric Region with a Near Complete Genome Assembly of the Oshima Cherry Cerasus speciosa

Fujiwara, K.; Toyoda, A.; Biswa, B. B.; Kishida, T.; Tsuruta, M.; Nakamura, Y.; Kimura, N.; Kawamoto, S.; Sato, Y.; Katsuki, T.; Sakura 100 Genome Consortium, ; Koide, T.

2024-06-22 genomics 10.1101/2024.06.17.599445 medRxiv

Top 0.1%

54.4%

The Oshima cherry (Cerasus speciosa), which is endemic to Japan, has significant cultural and horticultural value. In this study, we present a near complete telomere-to-telomere genome assembly for C. speciosa, derived from the old growth "Sakurakkabu" tree on Izu Oshima Island. Using Illumina short-read, PacBio long-read, and Hi-C sequencing, we constructed a 269.3 Mbp genome assembly with a contig N50 of 32.0 Mbp. We examined the distribution of repetitive sequences in the assembled genome and identified regions that appeared to be centromeric. Detailed structural analysis of these putative centromeric regions revealed that the centromeric regions of C. speciosa comprised repetitive sequences with monomer lengths of 166 or 167 bp. Comparative genomic analysis with Prunus sensu lato genome revealed structural variations and conserved syntenic regions. This high-quality reference genome provides a crucial tool for studying the genetic diversity and evolutionary history of Cerasus species, facilitating advancements in horticultural research and the preservation of this iconic species.

7

Chromosomal-level genome assembly of Populus lasiocarpa

Long, Z.; Sang, Y.; Feng, J.; Shi, T.; Dan, X.; Zhang, Y.; Liu, J.; Wang, J.

2023-07-13 genomics 10.1101/2023.07.11.548483 medRxiv

Top 0.1%

52.2%

Despite widespread biodiversity loss, our understanding of how species and populations will respond to accelerated climate change remains limited. In this study, we predict the evolutionary responses of Populus lasiocarpa, a key alpine forest tree species primarily found in the mountainous regions of a global biodiversity hotspot, to climate change. We accomplish this by generating and integrating a new reference genome for P. lasiocarpa, re-sequencing data for 200 samples, and gene expression profiles for leaf and root tissue following exposure to heat and waterlogging. Analyses of the re-sequencing data indicate that demographic dynamics, divergent selection, and long-term balancing selection have shaped and maintained genetic variation within and between populations over historical timescales. In examining genomic signatures of contemporary climate adaptation, we found that haplotype blocks, characterized by inversion polymorphisms that suppress recombination, play a crucial role in clustering environmentally adaptive variations. Comparison of evolved and plastic gene expression shows that genes with expression plasticity generally align with evolved responses, highlighting the adaptive role of plasticity. Lastly, we incorporated local adaptation, migration, genetic load, and plasticity responses into our predictions of population-level climate change risks. Our findings reveal that western populations, primarily distributed in the Hengduan Mountains, a region known for its environmental heterogeneity and significant biodiversity, are the most vulnerable to climate change and should be prioritized for conservation and management. Overall, our study advances understanding of the relative roles of long-term natural selection, local environmental adaptation, and immediate plasticity responses in driving evolutionary adaptation to climate change in keystone species.

8

SMRT sequencing generates the chromosome-scale reference genome of tropical fruit mango, Mangifera indica

Li, W.; Zhu, X.-G.; Zhang, Q.-J.; Li, K.; Zhang, D.; Shi, C.; Gao, L.-Z.

2020-02-23 genomics 10.1101/2020.02.22.960880 medRxiv

Top 0.1%

49.4%

Mango (Mangifera indica), a member of the family Anacardiaceae, is one of the world's most popular tropical fruits. Here we sequenced the variety "Hong Xiang Ya" and generated a 371.6-Mb mango genome assembly with 34,529 predicted protein-coding genes. Aided by the published genetic map, we assembled the M. indica genome to the chromosome level for the first time, and about 98.77% of the genome assembly was anchored to 20 pseudo-chromosomes. The availability of the chromosome-length genome assembly of M. indica will provide novel insights into genome evolution, help elucidate the genetic basis of specialized phytochemical composites relevant to fruit quality, and enhance allele mining in genomics-assisted breeding for mango genetic improvement.

9

Haploid-resolved and chromosome-scale genome assembly in hexa-autoploid sweetpotato (Ipomoea batatas (L.) Lam)

Yoon, U.-H.; Cao, Q.; Shirasawa, K.; Zhai, H.; Lee, T.-H.; Tanaka, M.; Hirakawa, H.; Hahn, J.-H.; Wang, X.; Kim, H. S.; Tabuchi, H.; Zhang, A.; Kim, T.-H.; Nagasaki, H.; Xiao, S.; Okada, Y.; Jeong, J. C.; Nagano, S.; Shin, Y.; Lee, H.-U.; Park, S.-U.; Lee, S. J.; Lee, K.; Yang, J.-W.; Ahn, B. O.; Ma, D.; Takahata, Y.; Kwak, S.-S.; Liu, Q.; Isobe, S.

2022-12-25 genomics 10.1101/2022.12.25.521700 medRxiv

Top 0.1%

45.8%

Sweetpotato (Ipomoea batatas (L.) Lam) is the world's seventh most important food crop by production quantity. Cultivated sweetpotato is a hexaploid (2n = 6x = 90), and its genome (B1B1B2B2B2B2) is quite complex due to polyploidy, self-incompatibility, and high heterozygosity. Here we established a haploid-resolved and chromosome-scale de novo assembly of autohexaploid sweetpotato genome sequences. Before constructing the genome, we created chromosome-scale genome sequences in I. trifida using a highly homozygous accession, Mx23Hm, with PacBio RSII and Hi-C reads. Haploid-resolved genome assembly was performed for a sweetpotato cultivar, Xushu18, by hybrid assembly with Illumina paired-end (PE) and mate-pair (MP) reads, 10x Genomics reads, and PacBio RSII reads. Then, 90 chromosome-scale pseudomolecules were generated by aligning the scaffolds onto a sweetpotato linkage map. De novo assemblies were also performed for chloroplast and mitochondrial genomes in I. trifida and sweetpotato. In total, 34,386 and 175,633 genes were identified on the assembled nuclear genomes of I. trifida and sweetpotato, respectively. Functional gene annotation and RNA-Seq analysis revealed locations of starch, anthocyanin, and carotenoid pathway genes on the sweetpotato genome. This is the first report of chromosome-scale de novo assembly of the sweetpotato genome. The results are expected to contribute to genomic and genetic analyses of sweetpotato.

10

Identification of active transposable element candidates from ROH in a de novo assembled chromosome-scale genome of a Nishikigoi, an ornamental fish derived from Common carp (Cyprinus carpio)

Hosaka, A.

2023-12-26 genomics 10.1101/2023.12.26.573356 medRxiv

Top 0.1%

39.6%

Transposable Elements (TEs) are major components of the genome. To understand their function and evolution, it is necessary to identify active TEs from a diverse range of organisms. Here, I report the genome of the Nishikigoi, an ornamental fish derived from the Common carp, and the novel approach to detecting active TE candidates. I constructed a chromosome-scale assembly using long-read sequencing and Hi-C methods. It revealed that Nishikigoi has Robertsonian-like chromosomal translocations not seen in Common carp. I also found that Nishikigoi has a significantly different genetic background from Common carp, reflecting the intensive breeding history. Furthermore, by focusing on Runs of Homozygosity (ROH) islands in the Nishikigoi genome and analyzing structural variations with long-read sequencing, I identified several active TE candidates. This study not only revealed the unique genetic features of Nishikigoi but also demonstrated the potential for a novel approach in the search for active TEs.

11

High-quality de novo genome assembly of Kappaphycus alvarezii based on both PacBio and HiSeq sequencing

Jia, S.; Wang, G.; Liu, G.; Qu, J.; Zhao, B.; Jin, X.; Zhang, L.; Yin, J.; Liu, C.; Shan, G.; Wu, S.; Song, L.; Liu, T.; Wang, X.; Yu, J.

2020-02-15 genomics 10.1101/2020.02.15.950402 medRxiv

Top 0.1%

39.6%

The red alga Kappaphycus alvarezii is the most important aquaculture species in Kappaphycus. It is widely distributed in tropical waters and has become the main crop for carrageenan production. Its mechanisms of adaptation to high-temperature, high-salinity environments and its carbohydrate metabolism may provide important inspiration for the study of marine algae. Scientific background knowledge such as genomic data will also be essential to improve disease resistance and production traits of K. alvarezii. A total of 43.28 Gb of short paired-end reads and 18.52 Gb of single-molecule long reads of K. alvarezii were generated on the Illumina HiSeq and PacBio RSII platforms, respectively. The de novo genome assembly was performed using Falcon_unzip and Canu software, and then improved with Pilon. The final assembled genome (336 Mb) consists of 888 scaffolds with a contig N50 of 849 Kb. Further annotation analyses predicted 21,422 protein-coding genes, with 61.28% functionally annotated. Here we report the draft genome and annotations of K. alvarezii, which are valuable resources for future genomic and genetic studies in Kappaphycus and other algae.

12

Slicing the genome of star-fruit (Averrhoa carambola L.)

Fan, Y.; Sahu, S. K.; Yang, T.; mu, w.; wei, j.; Cheng, L.; Yang, J.-l.; Xu, X.; Liu, X.; Mu, R.-c.; Liu, J.; Zhao, J.-m.; Zhao, Y.-x.; Liu, H.

2019-11-22 genomics 10.1101/851790 medRxiv

Top 0.1%

39.5%

Averrhoa carambola is commonly known as star fruit because of its peculiar shape, and its fruit is a rich source of minerals and vitamins. It is also used in traditional medicines in countries like India, China, the Philippines, and Brazil for treating various ailments such as fever, diarrhea, vomiting, and skin disease. Here we present the first draft genome of the Oxalidaceae family with an assembled genome size of 470.51 Mb. In total, 24,726 protein-coding genes were identified and 16,490 genes were annotated using various well-known databases. The phylogenomic analysis confirmed the evolutionary position of the Oxalidaceae family. Based on the gene functional annotations, we also identified enzymes possibly involved in the important nutritional pathways in the star fruit genome. Overall, as the first sequenced genome in the Oxalidaceae family, the data provide an essential resource for nutritional, medicinal, and cultivational studies of this economically important star-fruit plant.

13

Empowering medaka fish biology with versatile genomic resources in MedakaBase

Morikami, K.; Tanizawa, Y.; Yagura, M.; Sakamoto, M.; Kawamoto, S.; Nakamura, Y.; Yamaguchi, K.; Shigenobu, S.; Naruse, K.; Ansai, S.; Kuraku, S.

2025-05-16 genomics 10.1101/2025.05.13.653297 medRxiv

Top 0.1%

39.0%

Medaka, a group of small, mostly freshwater fishes in the teleost order Beloniformes, includes the rice fish Oryzias latipes, which is a prominent model organism for diverse biological fields. Chromosome-scale genome sequences of the Hd-rR strain of this species were obtained in 2007, and its improved version has facilitated various genome-wide studies. However, despite its widespread utility, omics data for O. latipes are dispersed across various public databases and lack a centralized platform. To address this, the medaka section of the National Bioresource Project (NBRP) of Japan established a genome informatics team in 2022 tasked with providing versatile in silico solutions for bench biologists. This initiative led to the launch of MedakaBase (https://medakabase.nbrp.jp), a web server that enables gene-oriented analysis including exhaustive sequence similarity searches. MedakaBase also provides genome-wide browsing of diverse datasets, including tissue-specific transcriptomes and intraspecific genomic variations, integrated with gene models from different sources. Additionally, the platform offers gene models optimized for single-cell transcriptome analysis, which often requires coverage of the 3' untranslated region (UTR) of transcripts. Currently, MedakaBase provides genome-wide data for seven Oryzias species, including original data for O. mekongensis and O. luzonensis produced by the NBRP team. This article outlines technical details behind the data provided by MedakaBase.

14

Genome assembly reconstruction of the Japanese honey bee, Apis cerana japonica (Hymenoptera: Apidae), using homology-based assembly and nanopore long-reads

Masuoka, Y.; Jouraku, A.; Kuwazaki, S.; Yoshiyama, M.; Horigane-Ogihara, M.; Maeda, T.; Suzuki, Y.; Bono, H.; Kimura, K.; Yokoi, K.

2023-07-26 genomics 10.1101/2023.07.26.550500 medRxiv

Top 0.1%

38.5%

Honey bees are important for agriculture (e.g., pollination and honey production). Additionally, honey bees are an important insect model species, especially as model social insects. The Japanese honey bee, Apis cerana japonica (a subspecies of the Asian honey bee, Apis cerana), is a Japanese domestic honey bee, which has several subspecies-specific traits. We previously constructed the draft genome sequence data of A. cerana japonica, but it needed to be improved considering the use of the genome sequence data for genome structural analysis and repetitive region analysis, as well as the availability of chromosome-level genome data of A. mellifera and A. cerana. In this study, we constructed the improved A. cerana japonica genome data and new gene set data with functional annotations. The constructed genome data, including 16 pseudochromosomes, was found to be highly contiguous and complete, and the gene set data covered most of the core genes in the BUSCO database. Thus, the constructed genome and gene set data have become more suitable as the reference data of A. cerana japonica.

15

Chromosome scale reference genome of Cluster bean (Cyamopsis tetragonoloba (L.) Taub.)

Gaikwad, K.; Ramakrishna, G.; Srivastava, H.; Saxena, S.; Kaila, T.; Tyagi, A.; Sharma, P.; Sharma, S.; Sharma, R.; Mahla, H.; SV, A. M.; Solanke, A.; Kalia, P.; Rao, A.; Rai, A.; Sharma, T.; Singh, N.

2020-05-18 genomics 10.1101/2020.05.16.098434 medRxiv

Top 0.1%

36.0%

Clusterbean (Cyamopsis tetragonoloba (L.) Taub.), also known as guar, is a widely cultivated dryland legume of Western India and parts of Africa. Apart from being a vegetable crop, it is also an abundant source of a natural hetero-polysaccharide called guar gum or galactomannan, which is widely used in cosmetics, pharmaceuticals, food processing, shale gas drilling, etc. Here, we report for the first time a chromosome-scale reference genome assembly of clusterbean, from a popular high-galactomannan guar cultivar, RGC-936, by combining sequenced reads from Illumina, 10x Chromium and Oxford Nanopore technologies. An initial assembly of 1580 scaffolds with an N50 value of 7.12 Mbp was generated. The final genome assembly was then obtained by anchoring these scaffolds to a high-density SNP map, yielding 550.31 Mbp in 7 pseudomolecules corresponding to 7 chromosomes, with a very high N50 of 78.27 Mbp. We predicted 34,680 protein-coding genes in the guar genome. The high-quality chromosome-scale clusterbean genome assembly will facilitate understanding of the molecular basis of galactomannan biosynthesis and aid in genomics-assisted breeding of superior cultivars.

16

Chromosome-scale assembly and annotation of the wild wheat relative Aegilops comosa

Li, H.; Rehman, S. u.; Song, R.; Qiao, L.; Hao, X.; Zhang, J.; Li, K.; Hou, L.; Hu, W.; Wang, L.; Chen, S.

2024-10-17 genomics 10.1101/2024.10.15.618371 medRxiv

Top 0.1%

34.8%

Wild relatives of wheat are valuable sources for enhancing the genetic diversity of common wheat. Aegilops comosa, an annual diploid species with an MM genome constitution, possesses numerous agronomically valuable traits that can be exploited for wheat improvement. In this study, we report a chromosome-level genome assembly of Ae. comosa accession PI 551049, generated using PacBio high-fidelity (HiFi) reads and high-throughput chromosome conformation capture (Hi-C) data. The assembly spans 4.47 Gb, featuring a contig N50 of 23.59 Mb and a scaffold N50 of 619.05 Mb. A total of 39,063 gene models were annotated through a combination of homoeologous proteins, Iso-Seq, and RNA-Seq data. Comparative genome analysis revealed a terminal intrachromosomal translocation in chromosome 2M of Ae. comosa (and Ae. umbellulata) compared to its homoeologous chromosomes in other diploid wheat species. Phylogenetic analysis showed a close relationship between Ae. comosa and Ae. umbellulata. This newly constructed reference genome of Ae. comosa will serve as an important genomic resource for comparative genomic studies and the cloning of agriculturally important genes.

17

Long-read genome assembly of the Japanese parasitic wasp Copidosoma floridanum (Hymenoptera: Encyrtidae)

Toga, K.; Sakamoto, T.; Kanda, M.; Tamura, K.; Okuhara, K.; Tabunoki, H.; Bono, H.

2023-09-25 genomics 10.1101/2023.09.24.559078 medRxiv

Top 0.1%

34.0%

Copidosoma floridanum is a cosmopolitan species and an egg-larval parasitoid of the Plusiine moth. C. floridanum has a unique development mode called polyembryony, in which thousands of genetically identical embryos are produced from a single egg. Some embryos develop into sterile soldier larvae, and their developmental patterns differ between the US and Japanese C. floridanum strains. Genome sequencing can accelerate our understanding of the molecular bases underlying polyembryony, including the production of soldier castes. However, only the genome sequence of the US strain has been reported. In the present study, we determined the genome sequence of the Japanese strain using Pacific Biosciences high-fidelity reads and generated a highly contiguous assembly (552.7 Mb, N50: 17.9 Mb). Gene prediction and annotation identified 13,886 transcripts derived from 10,786 gene models. We searched for genomic differences between the US and Japanese strains. Among the gene models predicted in this study, 100 gene loci in the Japanese strain had gene structures extremely different from those in the US strain. This was accomplished through functional annotation (GGSEARCH) and long-read sequencing. Genomic differences between strains were also reflected in the amino acid sequences of vasa, which plays a central role in caste determination in this species. The genome assemblies constructed in this study will facilitate genomic comparisons between Japanese and US strains, leading to a detailed understanding of the genomic regions responsible for the ecological and physiological characters of C. floridanum.

18

First haplotype-resolved genome assembly of citral-rich lemongrass Cymbopogon flexuosus var. Krishna

Tyagi, S.; Gupta, V.; Verma, S.; Negi, N. P.; Kumar, S.; Trivedi, P. K.

2026-02-18 genomics 10.64898/2026.02.17.706310 medRxiv

Top 0.1%

33.7%

Cymbopogon flexuosus var. Krishna (lemongrass) is an aromatic grass valued for its high citral content, which is widely used in the fragrance, flavor, and pharmaceutical industries. C. flexuosus, a member of the Poaceae family, is a predominantly outcrossing species characterized by a highly heterozygous genome. Despite its economic importance and widespread cultivation, a high-quality reference genome has been lacking. Here, we report the first chromosome-scale genome assembly of lemongrass, generated using PacBio HiFi long-read sequencing combined with Omni-C chromatin conformation capture data. The resulting pseudo-haploid assembly spans approximately 798 Mb, organized into 10 chromosomes, and exhibits a scaffold N50 of 64.35 Mb. The assembly demonstrates high completeness, with 99.8% BUSCO recovery, and comprises ~37,254 predicted protein-coding genes. In addition, we generated haplotype-resolved assemblies that capture the allelic diversity of this heterozygous genome. The haplotypes have sizes of ~750 Mb and ~726 Mb, representing 95-98% of the pseudo-haploid genome, and together they provide phase-resolved information for gene families and biosynthetic pathways. These high-quality assemblies establish a foundational genomic resource for advancing molecular breeding, comparative genomics, and metabolic engineering of lemongrass and related aromatic grasses.

19

Draft genome of a porcupinefish, Diodon holocanthus

Xu, M.; Su, X.; Zhang, M.; Li, M.; Huang, X.; Fan, G.; Liu, X.; Zhang, H.

2019-09-20 genomics 10.1101/775387 medRxiv

Top 0.1%

31.5%

The long-spine porcupinefish, Diodon holocanthus (Diodontidae, Tetraodontiformes, Actinopterygii), also known as the freckled porcupinefish, is of great ecological and economic interest. Its distinct characteristics, including its inflation reaction, spiny skin and tetrodotoxin, have not been fully studied, however, because no complete genome assembly has been available.

In this study, the whole genome of a single individual was sequenced using single tube-Long Fragment Read co-barcode reads, generating 154.3 Gb of paired-end data (219.8x depth). Gaps were further filled using a small Oxford Nanopore MinION long-read dataset (11.4 Gb, 15.9x depth). Making full use of long-, medium- and short-range genome assembly information, the final assembled sequences, with a total length of 650.02 Mb, reached contig and scaffold N50 sizes of 2.15 Mb and 8.13 Mb, respectively, despite high repetitive content. To assess completeness, Benchmarking Universal Single-Copy Orthologs analysis captured 95.7% (2,474) of core genes. In addition, 206.5 Mb (32.10%) of repetitive sequences were identified, and 20,840 protein-coding genes were annotated, among which 18,281 (87.72%) proteins were assigned possible functions.

This is the first de novo genome of the porcupinefish, which will benefit downstream analysis of ontogeny, phylogeny, and evolution, and improve the exploration of its unique defensive mechanism.

20

A chromosome-scale strawberry genome assembly of a Japanese variety, Reikou

Shirasawa, K.; Hirakawa, H.; Nakayama, S.; Sasamoto, S.; Tsuruoka, H.; Minami, C.; Watanabe, A.; Kishida, Y.; Kohara, M.; Yamada, M.; Fujishiro, T.; Isobe, S. N.

2021-04-23 genomics 10.1101/2021.04.23.441065 medRxiv

Top 0.1%

26.6%

Cultivated strawberry (Fragaria x ananassa) is an octoploid species (2n = 8x = 56) that is widely consumed around the world as both fresh and processed fruit. In this study, we report a chromosome-scale strawberry genome assembly of a Japanese variety, Reikou. The Illumina short reads derived from paired-end, mate-pair, and 10X Genomics libraries were assembled using Denovo MAGIC 3.0. The generated phased scaffolds consisted of 32,715 sequences with a total length of 1.4 Gb and an N50 length of 3.9 Mb. A total of 63 pseudomolecules including chr0 were created by aligning the scaffolds onto the Reikou S1 linkage maps with the IStraw90 Axiom SNP array and ddRAD-Seq. Meanwhile, genomes of diploid Fragaria species were resequenced and compared with the most similar chromosome-scale scaffolds to investigate the possible progenitor of each subgenome. Clustering analysis suggested that the most likely progenitors were F. vesca and F. iinumae. The phased pseudomolecules were assigned scaffold names with Av, Bi, and X, representing sequence similarity with F. vesca (Av), F. iinumae (Bi), and others (X), respectively. The result of a comparison with the Camarosa genome suggested the possibility of subgenome structure differences between the two varieties.